Introduction.  These notes form a brief but reasonably complete account of the
ideas underlying resolution.  We try to give equal emphasis to the logical
principles involved and to the computational issues which arise.  We therefore
study  in  some  detail  the  most  useful  symbolic  algorithms  (in  particular,  since  it  is  of 
greatest  importance,  the  unification  algorithm)  and  treat  logical  formulas  explicitly  as 
carefully  engineered  data  structures.  One  of  our  aims  is  to  explain  resolution  in 
general  in  such  a  way  that  the  important  special  case  of  Horn  clause  resolution  can 
be  properly  understood  within  the  broader  setting.  Horn  clause  resolution  is  the 
theoretical  framework  for  the  kind  of  logic  programming  which  is  done  by  users  of 
PROLOG.  Indeed  many  of  the  ideas  we  shall  discuss  are  concretely  realized, 
although  not  always  in  the  purest  form,  in  various  versions  of  PROLOG.  We  shall 
not  be  concerned  specifically  with  PROLOG,  however,  since  the  surface  details 
often  vary  considerably  from  version  to  version  and  are  often  complex  enough  to 
hide the relatively simple conceptual system which lies just below the surface.

Formal  symbolic  computation.  Expressions  as  data  structures.  Logic 
programming  is  a  technique  for  specifying  formal  symbolic  computations,  that  is, 
computations  with  formal  symbolic  expressions  as  data  objects.  Logicians  have 
long  dealt  with  expressions  in  this  way,  and  in  computer  science  one  must  adopt  the 
same  approach  in  dealing  formally  with  programming  languages.  In  formal  logic 
one  deals  with  expressions  as  formal  structured  objects,  to  be  disassembled  and 
constructed  according  to  mathematically  precise  procedures.  Thus  in  particular  in 
the  automatic  theorem  proving  problem  for  the  predicate  calculus  one  studies 
algorithms  which,  when  given  (as  input)  a  sentence  S  will  construct  (as  output)  a 
proof  P  of  S  (provided  that  S  is  logically  true).  Such  an  algorithm  may  sometimes 
(but  not  always,  because  of  the  impossibility  of  a  decision  procedure  for  logical  truth 
in  the  predicate  calculus)  be  able  to  indicate  correctly  that  S  is  not  logically  true  if 
that  is  indeed  the  case.  Both  S  and  P  have  to  be  treated  as  data  structures. 

In  most  theorem  proving  or  logic  programming  applications  the  formal  expressions 
will  have  an  informal  (perhaps  even  a  formal)  semantics  -  a  logic  programmer 
seeking to generate a set of "answer expressions" or a mathematician seeking to
demonstrate  a  theorem  will  usually  mean  something  by  them  -  but  in  formal 
theorem  proving  and  in  logic  programs  the  object  expressions  are  treated  purely 
formally,  that  is  to  say,  as  structured,  manipulable  syntactic  objects  of  whose 
meaning,  if  any,  no  "official"  notice  is  taken  at  all.  Only  their  (abstract)  form  is  used 
as  a  basis  for  both  their  analysis  and  synthesis.  This  point  of  view  will  of  course 
already be especially familiar to those who have used LISP or PROLOG.

Formal expressions.  Atoms = constants + variables.  Indeed in dealing
formally with expressions we find it very convenient to use the universal but
simple ontology of LISP.  Accordingly we shall take expressions to be
objects  generated  from  two  countably  infinite,  disjoint  sets,  the  set  CONST  of 
constants  and  the  set  VARS  of  variables,  by  the  single  binary  operation,  dot.  The 
constants and variables are also known as atoms.  Thus we have a countably
infinite set ATOMS which is just CONST ∪ VARS.  Constants and variables are
usually  concretely  represented  by  finite  strings  over  some  suitable  alphabet  of 
characters.  It  does  not  then  much  matter  what  these  characters  are,  but  to  avoid 
confusion  the  various  bracketing  characters  <  >  {  }  (  )  [  ]  should  not  be  among 
them, nor the space, nor the comma, nor the character reserved for the dot
operation:  •  .  In  what  follows  we  shall  write  variables  as  strings  of  lower  case 
letters,  possibly  subscripted,  as  for  example:  x,  reverse,  y2.  All  other  strings  are 
constants  (this  includes  numerals,  arithmetic  operators,  strings  containing  upper 
case  letters,  and  so  on).  So:  atoms  are  expressions.  But  there  is  another  kind  of 
expression,  the  so-called  cons. 

Conses.  If A and B are expressions, then so (to use the terminology of LISP) is
the cons P whose car is A and whose cdr is B, and we write:

P = [• A B]    ( "P is the cons of A and B" )

aP = A    ( "the car of P is A" )

dP = B    ( "the cdr of P is B" ).

Remarks:  "cons"  rhymes  with  "once"  and  is  singular;  "cdr"  is  pronounced  "kidder" 
by  some  and  "kudder"  by  others;  the  point  of  these  improbable  coinages  is  to 
preserve such words as "pair", "head", "tail" for other purposes, and to mark and
enforce  the  distinction  between  the  general  syntax  of  expressions-as-such  and  the 
particular  syntax  (to  which  we  shall  soon  come)  of  the  predicate  calculus.  The 
original  LISP  convention  was  to  write  the  dot  infixed,  [A  •  B]  rather  than  [•  A  B].  In 
general  [•  A  B]  is  the  same  expression  as  [•  B  A]  only  when  A  and  B  are  the 
same expression.  Nor is the expression [• A [• B C]] ever the same as the
expression [• [• A B] C].


Lists.  The  general  syntax  also  contains  the  notion  of  a  list.  Certain  constants  are 
given special roles, most notably the constant NIL, whose role as the empty list is
introduced  in  the  following  general  definition,  in  which  we  define  several  notions 
simultaneously: 

•  the constant NIL is a list; moreover it is the (only) empty list, and it has
   length 0;

•  for any list L of length n ≥ 0 and any expression A, the cons [• A L] is a list
   of length n+1; the 1st element of [• A L] is A and the (j+1)st element of
   [• A L] is the jth element of L, for j = 1, ..., n.

So NIL is the only list which is an atom.  All lists of nonzero length are conses.  Of
course,  not  all  conses  are  lists:  only  (but  all)  those  conses  whose  cdrs  are  lists  are 
themselves  lists. 

List  notation.  Although  the  dot  notation  is  in  principle  completely  adequate  for  all 
purposes, we shall mostly be dealing with lists and will therefore use LISP's more
intuitive  and  flexible  list  notation.  This  is  based  on  the  further  convention  that 
allows  nested  conses  of  the  form 

[• A1 [• A2 ... [• An NIL] ... ]],

namely nonempty lists, to be written without the dots or interior brackets, thus:

[A1 A2 ... An].


The  list  notation  also  allows  NIL  to  be  written  alternatively  and  intuitively  as:  [  ]. 

So  to  sum  up  the  general  syntax  of  expressions  we  have: 

expressions  =  conses + atoms

             =  conses + constants + variables.
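
The general syntax summarized above can be sketched as a small data structure.  This is only an illustration under stated assumptions: atoms are represented as Python strings, a cons as a 2-tuple, and the helper names (is_variable, make_list and so on) are hypothetical choices made for this sketch, not part of the notes.

```python
import re

NIL = "NIL"  # the constant playing the role of the empty list


def is_atom(e):
    """Atoms are represented here (by assumption) as Python strings."""
    return isinstance(e, str)


def is_variable(e):
    """Variables: lower-case letters, optionally followed by a digit subscript."""
    return is_atom(e) and re.fullmatch(r"[a-z]+[0-9]*", e) is not None


def is_constant(e):
    """Every atom that is not a variable is a constant."""
    return is_atom(e) and not is_variable(e)


def cons(a, b):
    """The cons [• a b], represented as a 2-tuple."""
    return (a, b)


def car(p):
    return p[0]


def cdr(p):
    return p[1]


def is_cons(e):
    return isinstance(e, tuple) and len(e) == 2


def is_list(e):
    """NIL is a list; a cons is a list exactly when its cdr is a list."""
    while is_cons(e):
        e = cdr(e)
    return e == NIL


def length(l):
    """Length of a list: 0 for NIL, n+1 for [• A L] with L of length n."""
    n = 0
    while is_cons(l):
        n, l = n + 1, cdr(l)
    return n


def make_list(*items):
    """Expand list notation [A1 A2 ... An] into nested conses ending in NIL."""
    result = NIL
    for item in reversed(items):
        result = cons(item, result)
    return result
```

With this representation make_list("A", "x") builds the same object as cons("A", cons("x", NIL)), and make_list() is NIL itself, mirroring the convention that [ ] is another way of writing NIL.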


Substitutions.  We shall have much to do with certain mappings of expressions to
expressions, called substitutions.  Intuitively, the image of an expression E under a
substitution θ is the expression which we can construct by replacing each
occurrence of each of a certain set of variables within E by another expression
(possibly by another variable).  Each occurrence of the same variable is replaced by
an occurrence of the same expression (called the value of that variable under θ).
More precisely, the behavior of a substitution θ is completely defined, because of
the "instantiation" rule given below, by its behavior on the variables.


Formal definition of application of substitutions.  For each expression E,
the expression Eθ onto which θ maps E is called the instance of E by θ.  The
following "instantiation" rule then says, intuitively, that the expression Eθ is the
result of simultaneously replacing each variable in E by the expression which is the
instance of that variable by θ.  The constants in E are left alone.  The rule is very
simple and has three cases:


(1)  Eθ  =  E                             when E is a constant;
(2)  Eθ  =  [• Aθ Bθ]                     when E is the cons [• A B];
(3)  Eθ  =  the instance of E under θ     when E is a variable.

Bindings.  Descriptions of substitutions.  In fact, many variables may be
mapped onto themselves by θ, and so will be treated by part (3) of the rule exactly
as part (1) treats constants.  This means that the instantiation rule allows us to
construct Eθ, for any expression E, if we know the instance Xθ of each variable
X whose instance by θ is different from the variable itself.  We say these
variables are bound by θ, and that the couples < X Xθ > are the bindings of θ.
Finally, in order to have a smooth, flexible notation for describing substitutions, we
say that any set of couples of the form < X Xθ > (where X is a variable) which
contains all the bindings of θ, is a description of θ.  Any couple in a description of
θ which is not a binding of θ must have the trivial form < X X >.  If we discard these
we get the unique minimal description of θ.

It  is  the  usual  convention,  which  we  have  been  following  above,  to  use  lower  case 
Greek  letters  to  denote  substitutions,  and  to  indicate  their  application  to 
expressions  by  juxtaposition  on  the  right. 

We  shall  frequently  indulge  in  a  harmless  abuse  of  notation  in  which  a  substitution 
is  identified  with  a  description  of  it.  Thus,  we  shall  write,  for  example, 

θ  =  { < y [A x] >,  < x 5 >,  < z [B y] > }

to mean that θ is the substitution described by the right hand side.  Then for
example we can easily verify that

[A [B x y] [C x y z w]]θ  =  [A [B 5 [A x]] [C 5 [A x] [B y] w]].
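
The three-case instantiation rule, and the example just given, can be checked mechanically.  The following is a minimal sketch, assuming (as an illustration only) that atoms are Python strings, conses are 2-tuples, and a substitution is given by a dict holding a description of it; the names instantiate and make_list are hypothetical.

```python
import re


def is_variable(e):
    """Variables are lower-case strings, optionally with a digit subscript."""
    return isinstance(e, str) and re.fullmatch(r"[a-z]+[0-9]*", e) is not None


def make_list(*items):
    """List notation [A1 ... An] as nested conses (2-tuples) ending in NIL."""
    result = "NIL"
    for item in reversed(items):
        result = (item, result)
    return result


def instantiate(e, theta):
    """Return the instance of e under theta (a dict describing the substitution)."""
    if isinstance(e, tuple):       # case (2): the cons [• A B] goes to [• Aθ Bθ]
        return (instantiate(e[0], theta), instantiate(e[1], theta))
    if is_variable(e):             # case (3): look the variable up;
        return theta.get(e, e)     # unbound variables are mapped onto themselves
    return e                       # case (1): constants are left alone


L = make_list
theta = {"y": L("A", "x"), "x": "5", "z": L("B", "y")}
expression = L("A", L("B", "x", "y"), L("C", "x", "y", "z", "w"))
instance = instantiate(expression, theta)
```

Note that the replacement is simultaneous: the x inside the value [A x] of y is not itself replaced by 5, because instantiate never re-visits the expressions it has just inserted.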

We shall also find it convenient to extend the instantiation notation and write Sθ,
where S is a set of expressions, to denote the set of all expressions Xθ, where X is
in S.  It is very important then to note that the set Sθ may have fewer members than
the set S, since it is possible that Xθ and Yθ should be the same expression
even though X ≠ Y.  Indeed, this is what is meant by saying that X and Y are
unified by θ.

Composition of substitutions.  The product θλ of two substitutions θ and λ
is simply their composition as mappings, that is, θλ is the substitution which maps
each expression E onto the expression (Eθ)λ.  Thus the product operation is
associative, and has an identity, namely the substitution which maps every
expression onto itself.  The usual notational convention, which we shall follow, is to
denote the identity substitution by the lower case Greek letter ε.  Note that the
empty set { } is then a description of ε.  Not all substitutions are bijections, of
course, but when θ is a bijection we shall write θ⁻¹ for its inverse.  We then
have θθ⁻¹ = θ⁻¹θ = ε.  A bijective substitution is called a change of variables.

Descriptions  of  products.  Since  in  computations  substitutions  are  represented 
by  their  descriptions,  we  shall  often  need  to  construct  a  description  of  a  product 
θλ, given descriptions of θ and λ.  We then use the following easily verified fact:

•  if V and W are the sets of variables bound by θ and λ
   respectively, then the set of couples
   { < X (Xθ)λ > | X in (V ∪ W) }
   is a description of θλ.

In order to write a minimal description of θλ we first construct this set and then drop
from it the trivial couples (if any) of the form: < V V >.  So, for example, the product of
the  substitution 

θ  =  { < x [A x y] >,  < v w > }

with the substitution

λ  =  { < x [C y] >,  < y [B z x] >,  < w v > }
is (dropping the trivial couple < v v >) the substitution
θλ  =  { < x [A [C y] [B z x]] >,  < y [B z x] >,  < w v > }
and their product in the reverse order is (dropping the trivial couple < w w >) the substitution
λθ  =  { < x [C y] >,  < y [B z [A x y]] >,  < v w > }.

Of  course,  as  this  example  illustrates,  substitutions  do  not  in  general  commute. 
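
The construction of a minimal description of a product can be sketched directly from the fact stated above.  This is an illustrative sketch only: it assumes atoms are Python strings, conses are 2-tuples and a description is a dict, and the function name compose is a hypothetical choice.

```python
def make_list(*items):
    """List notation [A1 ... An] as nested conses (2-tuples) ending in NIL."""
    result = "NIL"
    for item in reversed(items):
        result = (item, result)
    return result


def instantiate(e, theta):
    """Instance of e under theta; only variables ever appear as keys of theta."""
    if isinstance(e, tuple):
        return (instantiate(e[0], theta), instantiate(e[1], theta))
    return theta.get(e, e)


def compose(theta, lam):
    """Minimal description of the product: apply theta first, then lam."""
    product = {}
    for x in set(theta) | set(lam):                      # X ranges over V ∪ W
        image = instantiate(instantiate(x, theta), lam)  # (Xθ)λ
        if image != x:                                   # drop trivial couples < X X >
            product[x] = image
    return product


L = make_list
theta = {"x": L("A", "x", "y"), "v": "w"}
lam = {"x": L("C", "y"), "y": L("B", "z", "x"), "w": "v"}
theta_lam = compose(theta, lam)
lam_theta = compose(lam, theta)
```

Each product keeps one couple for a variable bound by only one factor: w is bound by λ alone and is still mapped onto v by θλ, while v is mapped onto w by λθ.  The two results differ, which is the non-commutativity noted above.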

Unification.  The central concept of the theory of instantiation is that of unification.
Consider the following unification problem:

•  Given any finite set S of expressions find (if one exists) a substitution θ
   which maps every expression in S onto the same expression (or
   what is the same, a substitution θ such that Sθ is a singleton)
   or show (if no such substitution exists) that no such substitution exists.

A positive solution θ of this problem is said to unify, or to be a unifier of, the set
S and the set S is said to be unifiable.  The expression in the singleton Sθ is thus
a common instance of all the expressions in S.

A unifiable set of expressions may have many, even infinitely many, different
unifiers.  However, if in the above problem we ask that the substitution θ be not
just a unifier of S, but a most general unifier of S, then the solution is essentially
unique if there is one.  Indeed, if θ is any most general unifier of S, then the set of
all the most general unifiers of S is just

{ θρ | ρ is a change of variables }.

Most general unifiers.  A substitution σ is a most general unifier of a set S of
expressions if it is a unifier of S with the additional property that any unifier θ of S
satisfies an equation θ = σλ for some substitution λ.  Intuitively: the unifier θ is
simply an instance (or special case) of a most general unifier of S.  Of all the
ideas needed for an understanding of resolution and logic programming this idea
of most general unification is undoubtedly the most important.

Example of unification.  The set { A, B } whose two members are the
expressions

A  =  [ [P x y u] [P y z v] [P x v w] [P [K t] t [K t]] ]
and

B  =  [ [P [G r s] r s] [P a [H a b] b] [P x y u] [P u z w] ]
is unified by the substitution

σ  =  { < x [G r [K [H r r]]] >  < y r >  < z [H r r] >

        < u [K [H r r]] >  < v r >  < w [K [H r r]] >

        < a r >  < b r >  < s [K [H r r]] >  < t [H r r] > },

as may be readily verified by applying σ to A and B and comparing the results.
Indeed σ maps both A and B onto the expression:

[ [P [G r [K [H r r]]] r [K [H r r]]]

  [P r [H r r] r]

  [P [G r [K [H r r]]] r [K [H r r]]]

  [P [K [H r r]] [H r r] [K [H r r]]] ].

Now, as it happens, σ is also a most general unifier of {A, B}: any other unifier θ of
{A, B} is a product σ{< r E >} of σ with a substitution which maps the variable r onto
any expression E whatsoever.  In particular, if E is a variable, then θ will also be a
most general unifier of A and B.

Examples  of  nonunifiable  sets.  On  the  other  hand,  for  example,  the  set 
{ [P x y u]  [Q a b c] }

is  not  unifiable.  It  is  not  difficult  to  see  why  this  is  so:  any  unifier  would  have  to 
unify  the  two  constants  P  and  Q,  which  is  impossible.  Again,  the  set 
{ [F  x]  x } 

is  not  unifiable:  any  unifier  would  have  to  map  the  variable  x  into  an  expression 
which  contained  itself  as  a  proper  subexpression,  which  is  impossible.  These  two 
reasons  for  nonunifiability  turn  out  to  be  the  only  two  that  there  are. 

A  simple  unification  algorithm.  We  next  give  a  simple  binary  unification 
algorithm  which  solves  all  "binary"  unification  problems  by  finding  a  most  general 
unifier  for  any  set  of  two  expressions  (if  the  set  is  unifiable),  or  detecting  its 
nonunifiability  (if  it  is  not  unifiable).  Bigger  finite  sets  can  then  be  handled 
iteratively,  by  using  the  (easily  verified)  fact  that 

•  θλ is a most general unifier of a set S of n ≥ 2 expressions among
   which are A and B, if θ is a most general unifier of the set {A, B}, and
   λ is a most general unifier of the set Sθ.

Note that if {A, B} is unifiable then Sθ will have at most (n-1) elements, since
Aθ = Bθ, and thus the iteration will stop after at most (n-1) steps.

Differences between expressions.  To help in formulating the binary
unification algorithm in a simple and intelligible (but inefficient) way we shall use
the notion of the difference Δ(X, Y) between any two expressions X and Y.

Intuitively, the difference between X and Y is the set of all unordered pairs of
expressions which occur opposite each other at corresponding positions in X and
Y where X and Y are not the same.  We have the following recursive
characterization:

Δ(X, Y)  =  { }                        if X and Y are the same expression,

         =  Δ(aX, aY) ∪ Δ(dX, dY)      if X and Y are both conses,

         =  { {X, Y} }                 otherwise.

Negotiability.  Reductions.  Such a difference is said to be negotiable if

(1)  it is nonempty, and

(2)  each pair in it has the property that at least one
     of its members is a variable and neither member
     occurs in the other.

Thus, because of this condition, for every member {U, V} of a negotiable
difference, at least one (and possibly both) of {< U V >} or {< V U >} must be a
substitution.  These substitutions are called reductions of the difference.

Examples of differences and their reductions.  The difference
Δ([P x y z], [Q a b c])
is the set

{ {P, Q}  {x, a}  {y, b}  {z, c} }.

This is not negotiable, because it contains the pair {P, Q} and neither of P, Q is a
variable.  Nor is the difference

Δ([F x], x)  =  { {[F x], x} }

negotiable, because x occurs in [F x].  On the other hand, the difference
Δ([P x], [y Q])  =  { {P, y}  {x, Q} }
is negotiable, and has two reductions

μ  =  { < y P > }   and   ν  =  { < x Q > }.
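
The three examples above can be reproduced with a short sketch of the difference, the negotiability test and the reductions.  The representation is an assumption made for illustration: atoms are Python strings, conses are 2-tuples, an unordered pair is a frozenset, and a reduction is a one-entry dict; all function names are hypothetical.

```python
import re


def is_variable(e):
    return isinstance(e, str) and re.fullmatch(r"[a-z]+[0-9]*", e) is not None


def make_list(*items):
    result = "NIL"
    for item in reversed(items):
        result = (item, result)
    return result


def occurs(u, e):
    """True if u is e itself or occurs somewhere inside e."""
    if u == e:
        return True
    return isinstance(e, tuple) and (occurs(u, e[0]) or occurs(u, e[1]))


def difference(x, y):
    """The set of unordered pairs at which x and y disagree."""
    if x == y:
        return set()
    if isinstance(x, tuple) and isinstance(y, tuple):
        return difference(x[0], y[0]) | difference(x[1], y[1])
    return {frozenset((x, y))}


def negotiable(diff):
    """Nonempty, and every pair has a variable member with no mutual occurrence."""
    if not diff:
        return False
    for pair in diff:
        u, v = tuple(pair)
        if not (is_variable(u) or is_variable(v)):
            return False
        if occurs(u, v) or occurs(v, u):
            return False
    return True


def reductions(diff):
    """All one-binding substitutions {< U V >} drawn from a negotiable difference."""
    result = []
    for pair in diff:
        u, v = tuple(pair)
        if is_variable(u):
            result.append({u: v})
        if is_variable(v):
            result.append({v: u})
    return result


L = make_list
d1 = difference(L("P", "x", "y", "z"), L("Q", "a", "b", "c"))
d2 = difference(L("F", "x"), "x")
d3 = difference(L("P", "x"), L("y", "Q"))
```

Here d1 and d2 fail the negotiability test for the two reasons given in the text, while reductions(d3) yields exactly the two substitutions of the last example.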

Unifiers  eliminate  differences.  The  intuitive  content  of  the  following 
proposition  is  then  that  the  difference  between  distinct  but  unifiable  expressions  is 
always  eliminable: 

•  Negotiability Lemma.  If A and B are distinct
   expressions, and θ unifies {A, B}, then Δ(A, B) is
   negotiable, and θ unifies each member of Δ(A, B).


We now state the (slow!)

•  Binary Unification Algorithm, for two expressions A, B as input:

1  σ ← ε;

2  while Δ(Aσ, Bσ) is negotiable do σ ← σμ
   where μ is any reduction of Δ(Aσ, Bσ);

3  return (if Δ(Aσ, Bσ) is empty then σ else "FAIL").

We  are  assured  that  this  algorithm  deserves  its  name  by  the 

Binary Unification Theorem.  Let A and B be any two
expressions.  Then {A, B} is unifiable if and only if the above Binary
Unification Algorithm terminates, when applied to A and B as input,
without returning "FAIL".  The σ then returned is, moreover, a
most general unifier of {A, B}.
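
The three steps of the slow algorithm can be transcribed almost literally.  What follows is a sketch under assumptions, not a definitive implementation: atoms are Python strings, conses are 2-tuples, substitutions are dicts holding minimal descriptions, and every function name is a hypothetical choice made here.

```python
import re

FAIL = "FAIL"


def is_variable(e):
    return isinstance(e, str) and re.fullmatch(r"[a-z]+[0-9]*", e) is not None


def make_list(*items):
    result = "NIL"
    for item in reversed(items):
        result = (item, result)
    return result


def instantiate(e, theta):
    if isinstance(e, tuple):
        return (instantiate(e[0], theta), instantiate(e[1], theta))
    return theta.get(e, e)          # only variables occur as keys of theta


def compose(theta, lam):
    """Minimal description of the product theta·lam."""
    images = {x: instantiate(instantiate(x, theta), lam) for x in set(theta) | set(lam)}
    return {x: e for x, e in images.items() if e != x}


def occurs(u, e):
    if u == e:
        return True
    return isinstance(e, tuple) and (occurs(u, e[0]) or occurs(u, e[1]))


def difference(x, y):
    if x == y:
        return set()
    if isinstance(x, tuple) and isinstance(y, tuple):
        return difference(x[0], y[0]) | difference(x[1], y[1])
    return {frozenset((x, y))}


def negotiable(diff):
    if not diff:
        return False
    for pair in diff:
        u, v = tuple(pair)
        if not (is_variable(u) or is_variable(v)) or occurs(u, v) or occurs(v, u):
            return False
    return True


def unify(a, b):
    sigma = {}                                   # step 1: sigma starts as the identity
    while True:
        diff = difference(instantiate(a, sigma), instantiate(b, sigma))
        if not negotiable(diff):                 # step 2: loop while negotiable
            break
        u, v = tuple(next(iter(diff)))           # pick any pair of the difference
        if not is_variable(u):
            u, v = v, u                          # make sure u is the variable
        sigma = compose(sigma, {u: v})           # sigma becomes sigma·{< u v >}
    return sigma if not diff else FAIL           # step 3
```

Because "any reduction" may be chosen at each repetition, the unifier returned for a unifiable pair is only determined up to a change of variables; a caller would normally check it by comparing the two instances rather than by comparing it with a fixed description.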

The idea of the proof: termination.  At each repetition of the loop in step 2 the
number of distinct variables in the expressions Aσ, Bσ decreases by 1 (since each
successive reduction {< U V >} eliminates the variable U in view of the fact that U
does not occur in V).  Hence step 2 must terminate after no more repetitions than
there are distinct variables in the input expressions A and B.

The idea of the proof: correctness.  If, at step 3, Δ(Aσ, Bσ) is empty, then
(obviously) the σ returned as output is a unifier of A and B, but it must be shown
that it is a most general unifier of A and B.  Well: this follows from the fact that for
every unifier θ of {A, B} the equation
θ = σθ

is an invariant of the computation.  It clearly (indeed, trivially) holds after step 1.  We
can then see that it is preserved throughout step 2 if we note that for any reduction
μ = {< U V >} of Δ(Aσ, Bσ) the equation θ = μθ holds (θ unifies {Aσ, Bσ} because
θ = σθ, so by the Negotiability Lemma Uθ = Vθ, while μ leaves every other variable
unchanged), allowing us to calculate:

θ = σθ = σ(μθ) = (σμ)θ.

Since σ is replaced by σμ at each repetition the equation θ = σθ is an invariant.
Thus for every unifier θ we have θ = σλ, with λ = θ.

Faster  binary  unification  algorithms.  The  slow  binary  unification  algorithm 
given  above  has  the  pedagogical  and  theoretical  advantage  of  being  concise  and 
intuitive.  However,  it  is  indeed  slow.  It  is  very  inefficient  in  both  time  and  space. 
For  example,  to  unify  with  this  algorithm  the  two  expressions 

[ [F x0 x0] [F x1 x1] [F x2 x2] ... [F xn-1 xn-1] ]

[ x1 x2 x3 ... xn ]

requires time and space proportional to 2^n.  It is easy to see that the most general
unifier is

{ < x1 [F x0 x0] >

  < x2 [F [F x0 x0] [F x0 x0]] >

  < x3 [F [F [F x0 x0] [F x0 x0]] [F [F x0 x0] [F x0 x0]]] >

  ...

  < xn (an expression containing "F" 2^n − 1 times and "x0" 2^n times) > }.
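
The exponential growth can be confirmed by building the expression bound to xn and counting its atoms.  This is a small illustrative sketch, assuming conses are Python 2-tuples and atoms are strings; the function names binding and count are hypothetical.

```python
def binding(n):
    """The expression bound to xn: x0 for n = 0, and [F e e] built from the previous one."""
    e = "x0"
    for _ in range(n):
        e = ("F", (e, (e, "NIL")))   # the list [F e e] written out as nested conses
    return e


def count(atom, e):
    """Number of occurrences of the given atom in the expression e."""
    if isinstance(e, tuple):
        return count(atom, e[0]) + count(atom, e[1])
    return 1 if e == atom else 0


sizes = [(count("F", binding(n)), count("x0", binding(n))) for n in range(1, 6)]
```

Each step doubles the number of occurrences of x0 and adds one F to twice the previous count, which gives 2^n occurrences of x0 and 2^n − 1 occurrences of F when the expression is written out in full.  Note that the tuple built by binding shares its two sub-expressions in memory; it is only the fully written-out form, which the slow algorithm effectively traverses, that is exponential.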

Fortunately  it  is  possible  to  exploit  other  aspects  of  unification  in  such  a  way  that 
much  faster  unification  algorithms  can  be  devised.  The  general  idea  behind  all  of 
them  is  based  on  the  equivalence  relations  which  are  induced  on  sets  of 
expressions  by  substitutions. 

Occurrence graphs.  If S is any set of expressions, the extent of S is the
smallest set of expressions which includes S and contains A and B whenever it
contains [• A B].  The occurrence graph of S is then the directed graph whose
nodes are the singleton sets {X} where X is an expression in the extent of S, with
an arc labelled a (respectively d) leaving the node {[• A B]} and impinging on the
node {A} (respectively {B}).  If X is a constant or a variable then the node {X} has no
arc leaving it.  Clearly, the occurrence graph of S is acyclic, but is not necessarily a
tree because of the possibility of "sharing".
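
The extent and the occurrence graph can be computed by a simple closure.  The sketch below is illustrative only: it assumes atoms are Python strings and conses are 2-tuples, it uses the expressions themselves as nodes instead of the singleton sets {X}, and the names extent and occurrence_graph are hypothetical.

```python
def extent(s):
    """Smallest set including s that contains A and B whenever it contains [• A B]."""
    result, stack = set(), list(s)
    while stack:
        e = stack.pop()
        if e not in result:
            result.add(e)
            if isinstance(e, tuple):   # a cons contributes its car and its cdr
                stack.extend(e)
    return result


def occurrence_graph(s):
    """Arcs (node, label, target): label 'a' leads to the car, 'd' to the cdr."""
    arcs = set()
    for e in extent(s):
        if isinstance(e, tuple):
            arcs.add((e, "a", e[0]))
            arcs.add((e, "d", e[1]))
    return arcs


# The list [F x x], written out as nested conses ending in NIL.
example = {("F", ("x", ("x", "NIL")))}
nodes = extent(example)
arcs = occurrence_graph(example)
arcs_into_x = [arc for arc in arcs if arc[2] == "x"]
```

For [F x x] the node for x is reached by two different arcs, so the graph is acyclic but not a tree, which is the sharing mentioned above.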

For each substitution θ we define the equivalence relation ≡θ by:

X ≡θ Y  if and only if  Xθ = Yθ.

Now if in the occurrence graph G of Sθ we replace each of its nodes {Xθ} by the
≡θ-equivalence class in which X lies, we get a graph H which is isomorphic to G.
Since G is acyclic, so is H.  Let us call H the occurrence graph of S/≡θ.  Intuitively,
Hθ = G.  The equivalence relation ≡θ therefore satisfies the two properties:

no cycles    the occurrence graph of S/≡θ is acyclic
structure    for all distinct X, Y in S:
             if X ≡θ Y then either {X, Y} contains a variable
             or X, Y are both dotted pairs
             and aX ≡θ aY and dX ≡θ dY.

The converse also holds: any equivalence relation ≡ on the extent of S which
satisfies the two conditions no cycles and structure is inducible on the extent of S
by a substitution.  Indeed, a most general substitution σ which induces ≡ on the
extent of S can be immediately constructed from ≡.

Set union algorithm.  The key idea of the various known fast binary unification
algorithms is to seek to construct, given distinct expressions A and B, an
equivalence relation on the extent of {A, B} satisfying the above properties.  First,
one constructs, if possible, an equivalence relation ≡ satisfying structure, for
which A ≡ B.  This is done by imitating the method of the very fast (almost linear in
time and linear in space) "set union algorithm" analyzed by R. E. Tarjan [14].  This
relation ≡ is then checked for the no cycles property by the (linear in time and
space) "topological sort" method described by Knuth [9].

The  remaining  details  are  too  many  for  a  complete  discussion  in  these  notes.  A 
recent paper by J. S. Vitter and R. A. Simmons [16] contains a full account, and
also discusses the potential speed-up of this algorithm if advantage is taken of the
opportunities it offers for parallel computations.  Gerard Huet described this idea
already in his thesis [7], and there are also papers by Paterson and Wegman [11],
and by Martelli and Montanari [10], which give strictly linear algorithms.  The
following brief sketch will give a feel for this general "set union" approach.

Equivalence  classes  represented  by  trees.  In  the  set  union  algorithm,  the 
classes  of  an  equivalence  relation  on  a  set  are  represented  as  trees  in  a  forest 
defined  by  a  set  E  of  equations  between  distinct  elements  of  the  set,  no  two 
equations  in  E  having  the  same  left  hand  side.  If  E  contains  the  equation  U  =  V 
we say that E equates U to V, and we think of U and V as being nodes in the
same  tree,  with  V  the  parent  of  U.  The  function  ROOT  is  defined  for  any  element 
A  of  the  set  and  such  a  collection  E  of  equations  as  arguments,  and  finds  the  root 
of  the  tree  in  which  A  lies: 

ROOT A E  =  if E equates A to B then ROOT B E else A.

The equivalence relation ≡E represented by E is then that for which
X ≡E Y  iff  ROOT X E = ROOT Y E.
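
The function ROOT and the relation it represents can be sketched as follows.  The representation is an assumption made for illustration: the set E of equations is a Python dict mapping each left hand side to its parent, which automatically guarantees that no two equations have the same left hand side; the names root and equivalent are hypothetical.

```python
def root(a, e):
    """ROOT A E: follow the equations of e upward from a until a root is reached."""
    while a in e:        # e equates a to e[a], so move to the parent
        a = e[a]
    return a


def equivalent(x, y, e):
    """X and Y are related by the relation E represents iff they have the same root."""
    return root(x, e) == root(y, e)


# Two trees: x -> y -> A, and z -> w.
equations = {"x": "y", "y": "A", "z": "w"}
```

The loop is the iterative form of the recursive definition above; it terminates because the equations form a forest.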

The  structure  part  of  the  algorithm  is  then  embodied  in  two  cooperating  functions 
EQUIV  and  EQUATE.  In  general  the  function  EQUIV,  when  given  expressions  A, 
B,  as  its  first  and  second  arguments  and  a  collection  E  of  equations  as  its  third 
argument, returns either E itself, if A ≡E B, or a set E+ of equations obtained by
adding to E the equations required for E+ to satisfy structure and for A ≡E+ B.  If
no such E+ exists, then EQUIV returns the message "FAIL".  The function
EQUATE  has  a  similar  behavior,  but  assumes  that  E  is  not  the  message  "FAIL" 
and  that  A  and  B  are  roots  of  E. 

We  then  define  UNIFY  by 

(UNIFY A B)  =  (EQUIV A B { })

where  EQUIV  and  EQUATE  are  defined  by 


EQUIV A B E  =
        if E is "FAIL" then "FAIL"
        else EQUATE (ROOT A E) (ROOT B E) E

EQUATE A B E  =
    1   if A = B then E
    2   else if {A, B} contains a variable then MERGE A B E
    3   else if {A, B} contains a constant then "FAIL"
    4   else EQUIV aA aB (EQUIV dA dB (MERGE A B E))

Since EQUATE is called by EQUIV, the input assumption for EQUATE is satisfied:
E is not the message "FAIL" and A and B are roots of E.  The action in line
3 of EQUATE is appropriate to having discovered that structure cannot be
satisfied, since in view of the tests in lines 1, 2 and 3, one of A, B must be a
constant while the other must be either a (distinct) constant or else a cons.  The
action in lines 2 and 4 is to merge the two trees (i.e. equivalence classes) into a
single tree (equivalence class) by equating one root to the other.  The action in line
4 is appropriate to having found that both A and B are conses, so that not only are
the two trees (equivalence classes) merged, but the resulting forest (equivalence
relation) is modified by such further mergings of equivalence classes as may be
necessary to satisfy structure.

If  we  use  the  following  simple  definition  for  MERGE 

MERGE A B E  =  if A is a variable then {A=B} ∪ E else {B=A} ∪ E

then  the  calls  of  MERGE  from  EQUATE  preserve  the  property  that 

•  if  a  tree  representing  an  equivalence  class  has  a  variable 

as  its  root,  then  all  expressions  in  that  equivalence  class  are  variables. 
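
The cooperating functions UNIFY, EQUIV, EQUATE and MERGE can be transcribed as follows.  This is a sketch under assumptions, not a definitive implementation: atoms are Python strings, conses are 2-tuples, the set E of equations is a dict from child to parent, and the lower-case function names are hypothetical renderings of the names used above.

```python
import re

FAIL = "FAIL"


def is_variable(e):
    return isinstance(e, str) and re.fullmatch(r"[a-z]+[0-9]*", e) is not None


def make_list(*items):
    result = "NIL"
    for item in reversed(items):
        result = (item, result)
    return result


def root(a, e):
    while a in e:
        a = e[a]
    return a


def merge(a, b, e):
    """MERGE: add the equation a=b if a is a variable, otherwise b=a."""
    new = dict(e)               # copy, so that e itself is left unchanged
    if is_variable(a):
        new[a] = b
    else:
        new[b] = a
    return new


def equate(a, b, e):
    """EQUATE: a and b are assumed to be roots of e, and e is not FAIL."""
    if a == b:                                        # line 1
        return e
    if is_variable(a) or is_variable(b):              # line 2
        return merge(a, b, e)
    if isinstance(a, str) or isinstance(b, str):      # line 3: a constant is involved
        return FAIL
    # line 4: both are conses, so merge them and then equate cdrs and cars
    return equiv(a[0], b[0], equiv(a[1], b[1], merge(a, b, e)))


def equiv(a, b, e):
    if e == FAIL:
        return FAIL
    return equate(root(a, e), root(b, e), e)


def unify(a, b):
    return equiv(a, b, {})
```

As in the text, this only constructs equations satisfying structure.  For instance unify applied to [F x] and x returns a set of equations and not "FAIL"; rejecting such a pair is the job of the separate no cycles check, which is not sketched here.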

Balancing and path compression.  However, this simple method of merging
forgoes the opportunity to imitate fully the set union algorithm, which, when
merging, balances the resulting tree by making the root of the taller tree the parent
of the root of the shorter tree.  The set union algorithm also compresses the trees:
each call (ROOT A0 E) returns a root An by accessing a sequence of equations

Ai = Ai+1,   (i = 0, ..., n-1)

in E (a "path").  Each of these equations is replaced by the equation Ai = An,
making the root the parent of each node Ai (i = 0, ..., n-1) (the path is
"compressed").  By complicating the fast unification algorithm slightly to reflect
these two refinements, an essentially linear cost performance can be achieved.
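
The two refinements can be illustrated on the plain set union structure.  This sketch is an assumption-laden illustration: the class name Forest and its methods are hypothetical, the balancing uses a stored rank as a bound on tree height, and it deliberately ignores the extra unification constraint that a class with a variable as root must contain only variables.

```python
class Forest:
    """A forest of trees representing equivalence classes."""

    def __init__(self):
        self.parent = {}   # the equations: child -> parent
        self.rank = {}     # upper bound on the height of the tree below a root

    def root(self, a):
        """Find the root of a, then compress the path that was followed."""
        path = []
        while a in self.parent:
            path.append(a)
            a = self.parent[a]
        for node in path:              # replace each Ai = Ai+1 by Ai = An
            self.parent[node] = a
        return a

    def merge(self, a, b):
        """Merge the classes of a and b, making the taller tree's root the parent."""
        a, b = self.root(a), self.root(b)
        if a == b:
            return a
        rank_a, rank_b = self.rank.get(a, 0), self.rank.get(b, 0)
        if rank_a < rank_b:
            a, b = b, a                # make a the root of the taller tree
        self.parent[b] = a
        if rank_a == rank_b:
            self.rank[a] = rank_a + 1  # equal heights: the merged tree grows by one
        return a


forest = Forest()
forest.merge("a", "b")
forest.merge("c", "d")
forest.merge("a", "c")
top = forest.root("d")
```

After the last call, the path from d to the root has been compressed, so d is now a direct child of the root and later calls of root on it take a single step.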

Logic.  After  all  these  preliminaries  we  can  now  get  to  our  main  topic  and  describe 
the clausal predicate calculus.  This is a special machine-oriented version of
the predicate calculus, in which one deals principally with only one form of
proposition (the clausal sequent) and uses only one inference principle by which
to prove the truths among such propositions (the resolution principle).


Clausal sequents.  A clausal sequent is an intuitively meaningful proposition
which asserts that a given set of universal disjunctive clauses, say the set P,
logically entails a given set of existential conjunctive clauses, say Q.

Sequent  notation.  We  write  the  sequent  by  joining  the  two  sets  with  a  sequent 
arrow,  thus: 

P => Q

and we say that P is the antecedent, and Q the succedent, of the sequent.  We
read the sequent as "P logically entails Q" or as "Q logically follows from P".  If Q
happens to be the empty set we can write the sequent simply as
P =>

and read it as "P is logically unsatisfiable".  We often write the antecedent and
succedent sets by simply listing their clauses in some order (the order being
irrelevant) and omitting the external set brackets.  Thus, we write
A, B, C => M, N
rather than

{A, B, C} => {M, N}.

Meaning  of  a  clausal  sequent.  Pending  the  more  exact  definitions  to  be 
given  below,  we  can  say  that  the  sequent  P  =>  Q  expresses  the  claim  that  there  is 
no  interpretation  of  the  predicate  symbols  and  function  symbols  of  the  clauses  in  P 
and  Q  under  which  all  the  clauses  in  P  are  true  and  all  those  in  Q  are  false. 

When Q is a singleton {C} this accords well with our everyday understanding of
what it means to say that C logically follows from (the sentences in) P: "C must be
true whenever the sentences in P are true".

Universal disjunctive and existential conjunctive clauses. A universal
disjunctive (u.d.) clause is a formal sentence of the form:

∀x_1 ... x_k (¬A_1 ∨ ... ∨ ¬A_p ∨ B_1 ∨ ... ∨ B_q)

which is equivalent to

∀x_1 ... x_k ((A_1 ∧ ... ∧ A_p) → (B_1 ∨ ... ∨ B_q))

while an existential conjunctive (e.c.) clause is one of the form:

∃x_1 ... x_k (A_1 ∧ ... ∧ A_p ∧ ¬B_1 ∧ ... ∧ ¬B_q)

which is equivalent to

∃x_1 ... x_k ((A_1 ∧ ... ∧ A_p) ∧ ¬(B_1 ∨ ... ∨ B_q)).

In each clause, the expressions A_1, ..., A_p and B_1, ..., B_q are predications
(defined below), and the x_1, ..., x_k are all the distinct variables which occur in them.
For both kinds of clause we say that

•  the set {x_1, ..., x_k} is the prefix of the clause,

•  the set {A_1, ..., A_p} is the body of the clause, and

•  the set {B_1, ..., B_q} is the head of the clause.

A  clause  with  an  empty  prefix  (i.e.  which  contains  no  variables)  is  called  a  ground 
clause. 

Empty  clauses.  When  both  head  and  body  are  empty  then  (necessarily)  also  the 
prefix  is  empty,  and  the  clause  itself  is  then  said  to  be  the  empty  clause  of  the  one 
kind  or  the  other.  The  empty  u.d.  clause  is  equivalent  to  the  formal  sentence  false, 
and  the  empty  e.c.  clause  is  equivalent  to  the  formal  sentence  true.  The  formal 
sentence false is false in every interpretation (see below), and the formal
sentence true is true in every interpretation.

The kernel of a clause. A u.d. clause with a given prefix, body and head is
transformed into an e.c. clause with the same prefix, body and head under the
operation of interchanging ∀ with ∃, ∧ with ∨, and (for each predication P)
¬P with P. The prefix, body and head are invariants of the transformation. We shall
find it useful to define the basic operations and notions of resolution in terms of
these invariants alone. We shall call the triple

< {x_1, ..., x_k} {A_1, ..., A_p} {B_1, ..., B_q} >

consisting of the prefix, body and head of a clause C, the kernel of C, regardless
of whether C is a u.d. clause or an e.c. clause. It is convenient to think of the u.d.
clause with kernel <X A B> as the tuple <∀ X A B>, and the e.c. clause with
kernel <X A B> as the tuple <∃ X A B>. Notice that when a u.d. clause and an
e.c.  clause  have  the  same  kernel  then  each  is  logically  equivalent  to  the  negation 
of  the  other.  It  has  been  the  usual  practice  in  clausal  predicate  calculus  to  deal 
only  with  u.d.  clauses,  and  to  call  these  simply  clauses  without  qualification. 
However,  in  our  opinion  this  practice  leads  to  unnecessary  awkwardness  in  the 
later  development  of  the  resolution  principle.  By  working  with  kernels  wherever 
possible  we  are  able  to  enjoy  the  conceptual  economy  of  the  usual  treatment 
without  denying  ourselves  the  richer  and  more  natural  means  of  expression  which 
the  availability  of  both  kinds  of  clause  provides. 

Predications and terms. Herbrand Universes and Bases. In any particular
application, we shall regard the clauses and clausal sequents as all built ultimately
out of the variables and the members of a certain set of constants, each constant in
which is classified as a predicate symbol or a function symbol and assigned an
arity. (The arity of a symbol may be any natural number. When the arity of a
function symbol is 0 the symbol is called an individual symbol.) Relative to such a
fixed set L of predicate and function symbols (sometimes called a lexicon) we
define certain expressions to be the terms over L and the predications over L, and
when these expressions contain no variables we say they are ground terms over L
and ground predications over L. The definitions are:

•  The terms over L are the variables, and also the lists
[F t_1 ... t_n] in which F is a function symbol of arity n in
L, and the t_i are terms over L. The set of all ground
terms over L is called the Herbrand Universe over L.

•  The predications over L are the lists [P t_1 ... t_n] in
which P is a predicate symbol of arity n in L, and the
t_i are terms over L. The set of all ground predications
over L is called the Herbrand Base over L.
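
The two definitions can be made concrete by enumerating a finite part of a Herbrand
Universe. The sketch below is an illustration only: the function names, the encoding
of a lexicon as two dictionaries from symbols to arities, the use of Python tuples
for the lists [F t_1 ... t_n], and the depth bound (needed because the universe is
infinite as soon as L has a function symbol of positive arity) are all assumptions.

```python
from itertools import product

def herbrand_universe(functions, depth):
    """Ground terms over `functions` (symbol -> arity), nested at most `depth` deep.

    An individual symbol c is written as the one-element tuple (c,)."""
    terms = {(f,) for f, n in functions.items() if n == 0}
    for _ in range(depth):
        bigger = set(terms)
        for f, n in functions.items():
            if n > 0:
                for args in product(terms, repeat=n):
                    bigger.add((f, *args))
        terms = bigger
    return terms

def herbrand_base(predicates, universe):
    """Ground predications [P t_1 ... t_n] whose arguments come from `universe`."""
    return {(p, *args)
            for p, n in predicates.items()
            for args in product(universe, repeat=n)}

# A lexicon with one individual symbol, one unary function, one unary predicate.
universe = herbrand_universe({"0": 0, "1+": 1}, depth=2)
base = herbrand_base({"NUMBER": 1}, universe)
```

With this lexicon and depth 2 the universe fragment contains the three terms
[0], [1+ [0]] and [1+ [1+ [0]]], and the base fragment contains one NUMBER
predication for each of them.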

Substitutions extended to tuples. It is useful to extend the substitution
notation not only (as was done earlier) to sets of expressions, but also to tuples
whose components are expressions or sets of expressions, in the obvious way:
e.g., the instance of the triple <X A B> by the substitution θ is the triple
<Xθ Aθ Bθ>, and we write it as: <X A B>θ.

We can now readily explain what is meant by a variant of a clause.

Variants. If C is a clause with kernel <X A B> and the substitution θ is a change
of variables, then Cθ is also a clause, with kernel <X A B>θ. It is called a variant
of <X A B>.

[NOTA  BENE:  the  substitution  operates  on  the  variables  of  the  prefix,  even  though 
they  are  from  the  logical  point  of  view  bound  variables  of  the  clause.  Our 
substitution  operations  know  nothing  about  the  meanings,  if  any,  that  we  associate 
with  expressions.] 

In particular every clause is a variant of itself (with θ = ε). Note also that if Cθ is a
variant of C then C is a variant of Cθ since

Cθθ^{-1} = Cε = C.

Finally, if A is a variant of B and B is a variant of C then A is a variant of C. Thus
'being a variant of' is an equivalence relation on clauses.
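
The extension of substitution to sets and tuples, and the symmetry of the variant
relation, can be sketched as follows. This is an illustration under stated
assumptions: variables are strings beginning with "?", expressions are Python
tuples, sets are frozensets, a substitution is a dictionary, and the names subst and
inverse are ours. The change of variables used here is a permutation of four
variables, so it has an inverse.

```python
def subst(e, theta):
    """Apply theta to an expression, a set of expressions, or a tuple of these."""
    if isinstance(e, frozenset):
        return frozenset(subst(a, theta) for a in e)
    if isinstance(e, tuple):
        return tuple(subst(a, theta) for a in e)
    return theta.get(e, e)          # variables are replaced, symbols are kept

def inverse(theta):
    """Inverse of a change of variables (a one-to-one map between variables)."""
    return {new: old for old, new in theta.items()}

# Kernel <X A B> with X = {x, y}, A = {[P x]}, B = {[Q [f x] y]}.
kernel = (frozenset({"?x", "?y"}),
          frozenset({("P", "?x")}),
          frozenset({("Q", ("f", "?x"), "?y")}))

# A change of variables: swap x with u and y with v.
theta = {"?x": "?u", "?u": "?x", "?y": "?v", "?v": "?y"}
variant = subst(kernel, theta)
```

Applying inverse(theta) to the variant gives back the original kernel, which is
the computation C θ θ^{-1} = C in this representation. Note that subst rewrites the
prefix as well, in line with the NOTA BENE above.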

Separated clauses. Instances. Two clauses are said to be separated if their
prefixes are disjoint. A clause D is an instance of a clause C if there is a
substitution θ such that D = Cθ. (In order for this definition to make sense, the
clauses must be construed as tuples.) If the prefix of D is empty, it is a ground
instance of C.

Herbrand Interpretations of L. An Herbrand interpretation J of a lexicon L is
given by specifying, for each predication in the Herbrand Base of L, whether it is
true in J or false in J.

The idea behind this way of defining interpretations is that such a specification of
truth or falsehood can be automatically extended to clauses, since in every
Herbrand interpretation

•  the variables in clauses over L range over the Herbrand
Universe of L, which is taken as the domain of individuals
of the interpretation J; so that

•  a u.d. (respectively, e.c.) clause is true (respectively,
false) in J if and only if all its ground instances are true
(respectively, false) in J; and

•  a u.d. (respectively, e.c.) ground clause with body A
and head B is false (respectively, true) in J if and only if each
member of A is true in J and each member of B is false in J.

Counterexamples of clausal sequents. Let S be a clausal sequent over the
lexicon L. Then an interpretation J of L is a counterexample of S if and only if
every clause in the antecedent of S is true in J, and every clause in the succedent
of S is false in J.

Equivalence of clausal sequents. Two sequents over L are equivalent if and
only if every counterexample of either is also a counterexample of the other.

Truth of clausal sequents. A clausal sequent is true if and only if it has no
counterexamples.
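
For sequents made only of ground clauses over a finite set of ground predications,
these definitions can be checked by brute force. The sketch below is restricted to
that ground, finite case and is only an illustration: an interpretation J is
represented by the set of predications it makes true, a ground clause by the pair
(body, head) of its kernel, and the function names are assumptions.

```python
from itertools import product

def ud_true(body, head, J):
    """A ground u.d. clause is false exactly when its body is all true and its head all false."""
    return not (all(a in J for a in body) and all(b not in J for b in head))

def ec_true(body, head, J):
    """A ground e.c. clause is true exactly when its body is all true and its head all false."""
    return all(a in J for a in body) and all(b not in J for b in head)

def is_counterexample(antecedent, succedent, J):
    """Every antecedent (u.d.) clause true in J, every succedent (e.c.) clause false in J."""
    return (all(ud_true(a, b, J) for a, b in antecedent)
            and not any(ec_true(a, b, J) for a, b in succedent))

def sequent_true(antecedent, succedent, base):
    """Try every interpretation of the finite `base`; true if none is a counterexample."""
    atoms = sorted(base)
    for bits in product([False, True], repeat=len(atoms)):
        J = {atom for atom, bit in zip(atoms, bits) if bit}
        if is_counterexample(antecedent, succedent, J):
            return False
    return True

# The sequent  (P -> Q), P  =>  Q  written with kernels (body, head).
rule = (frozenset({"P"}), frozenset({"Q"}))
fact = (frozenset(), frozenset({"P"}))
goal = (frozenset({"Q"}), frozenset())
```

Here sequent_true([rule, fact], [goal], {"P", "Q"}) holds, while dropping the
fact P leaves the empty interpretation as a counterexample. The exhaustive loop is
exponential in the size of the base and is meant only to mirror the definitions.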

Being  true  is  in  general  only  a  semidecidable  property  of  clausal  sequents.  There 
is  no  general  algorithm  for  detecting  the  falsehood  of  every  false  sequent,  but  there 
are,  as  we  shall  see,  sound  and  complete  systems  of  proof  for  clausal  sequents, 
and  such  a  system  of  proof  for  clausal  sequents  is  an  algorithmic  method  of 
recognizing the truth of any clausal sequent which is in fact true. This is what the
resolution  principle  makes  possible,  in  a  reasonably  efficient  form. 

Obvious  sequents.  Some  true  clausal  sequents  can  be  immediately  recognized 
as  such:  for  example,  those  which  contain  false  in  the  antecedent  or  true  in  the 
succedent. We call these obvious sequents. In general, however, true clausal
sequents are not obviously true, and we need some other means of establishing
their truth. This is the motivation for the resolution principle.

Proving  true  clausal  sequents  by  means  of  resolution.  The  resolution 
principle  is  based  upon  the  following  construction,  involving  a  unification  algorithm, 
of  a  resolvent  of  the  kernels  of  two  separated  clauses. 

Resolvents. Let E, F be two separated clauses. Let the kernels of E and F be
<X A B> and <Y C D> respectively. Then a clause R with kernel <Z M N> is a
resolvent of E with F on K if and only if K is a unifiable subset of B ∪ C with most
general unifier σ, such that both K ∩ B and K ∩ C are nonempty, and such that

•  Z = (Xσ ∪ Yσ) ∩ VARS,

•  M = Aσ ∪ (Cσ − Kσ),

•  N = (Bσ − Kσ) ∪ Dσ.

Note that both the u.d. clause R with kernel <Z M N> and the e.c. clause R with
kernel <Z M N> are resolvents of E with F on K.
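
The construction of Z, M and N can be followed in a small sketch. It is deliberately
restricted, and the restriction is our assumption rather than part of the
definition: K is always a pair {b, c} with b in B and c in C, so larger unifiable
subsets are not considered. Variables are strings beginning with "?", predications
and terms are tuples, and the unifier is a plain recursive one with an occurs check,
not the fast algorithm described earlier in these notes.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    return t == v or (isinstance(t, tuple) and any(occurs(v, a, s) for a in t))

def unify(a, b, s):
    """Extend substitution s so that a and b become equal, or return None."""
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return None if occurs(b, a, s) else {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def resolve(t, s):
    """Carry out substitution s on t completely."""
    t = walk(t, s)
    return tuple(resolve(a, s) for a in t) if isinstance(t, tuple) else t

def resolvents(E, F):
    """Kernels <Z M N> of resolvents of E with F on two-element sets K = {b, c}."""
    (X, A, B), (Y, C, D) = E, F
    found = []
    for b in B:
        for c in C:
            sigma = unify(b, c, {})
            if sigma is None:
                continue
            def inst(S):
                return frozenset(resolve(t, sigma) for t in S)
            K = inst({b, c})
            Z = frozenset(v for v in inst(X | Y) if is_var(v))
            found.append((Z, inst(A) | (inst(C) - K), (inst(B) - K) | inst(D)))
    return found

# E: forall x ([P x] -> [Q x])      F: forall y ([Q [f y]] -> [R y])
E = (frozenset({"?x"}), frozenset({("P", "?x")}), frozenset({("Q", "?x")}))
F = (frozenset({"?y"}), frozenset({("Q", ("f", "?y"))}), frozenset({("R", "?y")}))
```

For these two clauses the only resolvent has prefix {y}, body {[P [f y]]} and head
{[R y]}.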

The  resolution  principle.  The  resolution  principle  is  an  inference  principle  for 
clausal  sequents.  This  is  a  somewhat  different  point  of  view  from  the  usual  one.  In 
the  usual  treatment  [12],  resolution  is  formulated  as  an  inference  principle  for  u.d. 
clauses,  and  allows  us  to  infer  a  u.d.  clause  as  conclusion  from  two  u.d.  clauses  as 
premises.  By  restating  the  principle  for  sequents,  however,  we  uncover  more  of 
the  power  and  flexibility  of  the  underlying  idea,  and  also  allow  e.c.  clauses  to  enter 
into  resolution  reasoning  in  a  natural  way.  So  we  state  the  resolution  principle  as 
follows: 


from a nonobvious clausal sequent S one may infer
the clausal sequent S + R, and conversely, provided
that R is a resolvent of two separated variants
of clauses in S and provided that R is not a variant
of a clause in S.

By S + R we mean the clausal sequent obtained by adding R to the antecedent of
S, if it is a u.d. clause, or to the succedent of S, if it is an e.c. clause. Any such
sequent S + R is then said to be a resolution of the sequent S.

Comment  on  the  definition.  The  provision  that  R  should  not  be  a  variant  of  a 
clause  in  S  eliminates  an  undesirable  source  of  redundancy  and  ensures  that  a 
given  sequent  has  only  finitely  many  essentially  different  resolutions.  The 
provision  that  the  sequent  S  be  nonobvious  removes  the  possibility  of  continuing 
to  look  for  a  proof  when  the  proof  is  in  fact  complete.  This  will  become  clear  in  the 
next  paragraph. 

Resolution series; resolution proofs. A series S_0, ..., S_r of clausal
sequents in which S_{i+1} is a resolution of S_i, 0 ≤ i < r, is called a resolution series.
A resolution series is a resolution proof (of its initial sequent) if and only if its final
sequent is obvious.

Soundness of the resolution principle. The logical justification of the
resolution principle rests on the fact (not difficult to establish) that all sequents in a
resolution series are equivalent. This means that all sequents in a resolution proof
are true, as is clear from the fact that its final sequent, being by definition an
obvious sequent, is (obviously) true. In particular, then, the initial sequent of a
resolution proof is true, and so the proof really is a proof of its initial sequent. We
simply read the proof backwards, starting with an obviously true sequent, and
proceeding in truth-preserving steps until we arrive at the initial sequent with the
knowledge that it too must be true. As a proof system for clausal sequents, in other
words, the resolution principle is sound. The significance of the resolution principle
for computational logic then rests on two further properties of resolution, local
finiteness and completeness, one of which is easy to see, the other of which is not.

Local finiteness. A clausal sequent has only finitely many resolutions, all of
which  can  be  effectively  constructed.  Hence  we  can  effectively  find  all  finite 
resolution  series  starting  with  a  given  sequent,  and  therefore  all  resolution  proofs  of 
that  sequent,  if  it  has  any.  This  fact  allows  us  to  build  resolution-proof-finding 
systems. 

Completeness of the resolution principle. Every true sequent has at least
one  resolution  proof. 

This  fact  is  quite  nontrivial.  Because  of  it,  resolution-proof-finding  systems  are 
truth-recognition  systems  for  clausal  sequents. 

Horn sequents. The general resolution-proof-finding procedure as sketched
above is rather attractive compared with older proof-finding systems for the
predicate calculus. However, it is not yet in an efficient enough form for what we
now call logic programming. There are just too many resolutions at each step for it
to  be  feasible  to  search  out  all  resolution  proofs  of  a  given  true  clausal  sequent. 
Nevertheless  Green  [5]  was  able  to  use  the  general  procedure  to  lay  out  and 
motivate  the  main  ideas  of  logic  programming  as  we  know  them  today.  Soon 
thereafter  Colmerauer  [4]  with  his  PROLOG  system,  and  Kowalski  [8]  in  a  more 
abstract  form,  showed  that  by  sacrificing  some  generality  one  can  achieve  a 
remarkably  useful  system  of  logical  computation.  The  key  idea  is  to  work  with  a 
restricted  version  of  clausal  predicate  calculus  rather  than  the  full  system.  In  the 
restricted  system  we  consider  only  Horn  sequents,  rather  than  clausal  sequents  in 
general. 

Horn  sequents;  procedure  clauses;  goal  clauses.  A  clause  is  a  procedure 
clause  if  its  head  is  a  singleton,  and  a  goal  clause  if  its  head  is  empty.  Both 
procedure  clauses  and  goal  clauses  are  called  Horn  clauses.  In  both  kinds  of 
clauses  the  predications  in  the  body  of  the  clause  are  called  goals.  A  Horn 
sequent is then a clausal sequent whose antecedent contains only procedure
clauses  and  whose  succedent  contains  only  goal  clauses.  A  minimal  Horn 
sequent  is  one  which  contains  only  one  goal  clause. 

Logic programs = minimal Horn sequents. A minimal Horn sequent S thus
has the form P => (∃Y)(G_1 ∧ ... ∧ G_n), where P is a set of procedure clauses, and
{G_1, ..., G_n} is a nonempty set of goals, while the variables in Y are all those which
occur in the goals G_i. The various resolution proofs (if any) of S will include the
very special LUSH resolution proofs which we define below, but also many, many
more.  It  turns  out  that  if  S  is  true  then  not  only  does  it  have  a  resolution  proof 
(which  we  already  know)  but  it  even  has  a  LUSH  resolution  proof.  This  is  crucial 
for  logic  programming  purposes.  It  means  that  a  logic  programming  engine  need 
only  search  through  the  very  much  sparser  space  of  LUSH  resolution  series 
starting  with  a  true  minimal  sequent  S,  in  order  to  be  sure  of  finding  a  proof  of  S. 

Selection and removal functions. To help in stating the idea of LUSH
resolution we need the idea of a selection function, namely, a function ↑ defined
on every nonempty set M, whose value (↑M) at M is some member of M. To each
such selection function corresponds the removal function ↓, which when
applied to a nonempty set M returns the set which is the result of removing the
expression (↑M) from M.

LUSH resolution series. Let ↑ be any selection function. Given the resolution
series

S_0, S_1, ..., S_i, ...,

let us write

S_{i+1} = S_i + R_{i+1}   (i ≥ 0)

to show that at each step the sequent S_{i+1} is obtained by adding the resolvent
R_{i+1} to the sequent S_i. Then the series is a LUSH resolution series controlled by ↑
provided that

•  each S_i (i ≥ 0) is a Horn sequent,

•  S_0 is a minimal Horn sequent with goal clause R_0,

•  R_{i+1} (i ≥ 0) is a resolvent with R_i, on the set {H_i, ↑C_i},
of (a variant) ∀X_i (A_i → H_i) of some procedure clause
in S_i, where C_i is the body of R_i.

The clause R_i is called the active clause of the sequent S_i. Thus the active
clause  of  each  sequent  in  the  series  after  the  first  is  the  resolvent  of  some 
procedure  clause  in  the  preceding  sequent  with  the  active  clause  of  the  preceding 
sequent.  The  active  clause  in  the  initial  sequent  is  the  (only)  goal  clause  in  that 
sequent.  Note  that  in  a  LUSH  resolution  series  the  resolvents  are  always  goal 
clauses:  no  procedure  clauses  are  generated  as  resolvents.  Note  also  that  each 
procedure  clause  can  produce  at  most  one  resolvent  with  the  active  clause,  and 
will do so if, but only if, its head can be unified with the goal selected by ↑ from
the body of the active clause. The rapidity with which this can be decided will
depend on the complexity of the particular function ↑ being used, on the method
of representation of the sets of goals to which it is applied, and on the method of
representing the procedure clauses in the initial sequent. PROLOG uses
particularly efficient methods, in which an order is imposed on the goals.
PROLOG's ↑ respects this ordering in the sense that (↑M) is always the first of the
most  recently  added  elements  of  M.  The  details  are  too  many  for  further  discussion 
here. 

The  idea  of  LUSH  resolution  is  originally  due  to  Kowalski  [8]  but  the  acronymic 
label  is  due  to  Hill  [6].  It  is  intended  to  suggest:  Linear  resolution  with  Unrestricted 
Selection  function  for  Horn  clauses.  The  LUSH  resolution  proofs  of  a  true  minimal 
Horn  sequent  are  far  fewer  in  number  than  the  ordinary  resolution  proofs  of  it. 
Many people, following the example of van Emden and Kowalski [15] and Lloyd
[17],  prefer  to  use  the  acronym  SLD  (Selected  Linear  resolution  for  Definite 
clauses)  instead  of  LUSH.  However,  acronyms  are  not  to  be  multiplied,  or  even 
added,  beyond  necessity. 

LUSH resolution is complete in the following very strong sense: if S is a true
minimal Horn sequent, then for all selection functions ↑ there is at least one
LUSH resolution proof of S which is controlled by ↑.

LUSH resolution series as computations. We can associate with the LUSH
resolution series S_0, S_1, ..., S_i, ... controlled by ↑ the computation which is the
series C_0, C_1, ..., C_i, ... of states, each of which is the body of the active clause
R_i of the corresponding sequent S_i. Consider, then, a LUSH resolution series
whose initial Horn sequent P => (∃Y)C provides the initial state C. The
relationship between successive states corresponding to the successive Horn
sequents in the series exhibits a repetitive pattern: each successive state is
obtained from its predecessor C_j by the same "computation cycle".

The computation cycle. To obtain a successor of the nonempty state C_j we
take some procedure clause in P such that, for some suitable variant

<X_j A_j {H_j}>θ_j

of its kernel <X_j A_j {H_j}>, the set {↑C_j, H_jθ_j} is unifiable with most general unifier σ_j.
If no such procedure clause exists, then the state C_j is a "failure", and has no
successors. Otherwise, a new state C_{j+1} is formed by the construction

C_{j+1} = A_jθ_jσ_j ∪ (↓C_j)σ_j.

In general there may be more than one procedure clause for which this
construction can be made. In other words, a state may have more than one
successor.
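
One computation cycle can be sketched as follows, constructing each successor state
explicitly. The representation is an assumption made for illustration: a state is a
tuple of goals (so that the selection function takes the first goal and the removal
function drops it), a procedure clause is a pair (head, body), variables are strings
beginning with "?", and the variant needed at step j is made by tagging each
variable with j. The unifier is a plain recursive one with an occurs check.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")

def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    t = walk(t, s)
    return t == v or (isinstance(t, tuple) and any(occurs(v, a, s) for a in t))

def unify(a, b, s):
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if is_var(a):
        return None if occurs(a, b, s) else {**s, a: b}
    if is_var(b):
        return None if occurs(b, a, s) else {**s, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

def resolve(t, s):
    t = walk(t, s)
    return tuple(resolve(a, s) for a in t) if isinstance(t, tuple) else t

def select(state):                 # selection function: the first goal
    return state[0]

def remove(state):                 # removal function: all goals but the first
    return state[1:]

def rename(t, j):
    """Variant of t in which every variable is tagged with the step number j."""
    if is_var(t):
        return f"{t}_{j}"
    return tuple(rename(a, j) for a in t) if isinstance(t, tuple) else t

def successors(state, program, j):
    """All successor states of the nonempty state C_j; an empty list means failure."""
    result = []
    for head, body in program:
        sigma = unify(select(state), rename(head, j), {})
        if sigma is not None:
            goals = tuple(rename(g, j) for g in body) + remove(state)
            result.append(tuple(resolve(g, sigma) for g in goals))
    return result

program = [(("NUMBER", ("1+", "?x")), (("NUMBER", "?x"),)),
           (("NUMBER", "0"), ())]
```

From the state containing the single goal [NUMBER y] this program gives two
successors: the state containing [NUMBER x_1], and the empty state.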

Stacklike behavior of successive states. This computation cycle does
indeed abstractly resemble the basic cycle of a simple stack-oriented computer, if
we think of ↑ as returning, and ↓ as removing, the top element of the stack. The
states are then the successive contents of the stack. The cycle thus consists,
partly, of popping the top goal from the stack and pushing the goals of the
procedure body onto the stack. This is what PROLOG actually does. Unfortunately
this simple analogy does not account for the application, to each goal in the
stack, of the most general unifier σ_j. However, by means of an idea due originally
to Boyer and Moore in their Edinburgh resolution theorem prover [1], we can find a
natural computational role in this analogy for the substitutions θ_j and σ_j of this
basic cycle: that of the environment of bindings for the variables.

Implicit representation of expressions. The idea of Boyer and Moore is to
represent an expression Eθ implicitly by the ordered pair <E θ> instead of
actually carrying out the work of applying θ to E. This ordered pair can be thought
of as a "closure" or a "delayed evaluation". The ordered pair <E θ> can be
treated in all respects as though it were the explicit expression Eθ that it implicitly
represents: for example the result Eθλ of applying λ to the expression
represented by the ordered pair <E θ> is itself represented by the ordered pair

<<E θ> λ>

and so on. When pairs are nested to the left like this we follow an "association to
the left" convention and drop the inner brackets. In general, the expression
implicitly represented by <E θ_1 ... θ_n> can be found simply by carrying out the
"delayed" work of applying the successive substitutions, to yield the explicit
expression: ((Eθ_1) ... θ_n). It is straightforward to adapt the procedures UNIFY,
EQUIV, EQUATE, ROOT, MERGE, etc., to this method of implicit representation of
expressions. We can then use the Boyer-Moore idea to represent the successive
states of a LUSH computation. Instead of actually explicitly constructing the state
C_{j+1} by applying the most general unifier σ_j to the set (A_jθ_j ∪ ↓C_j) we can simply
represent C_{j+1} by pairing this set with σ_j:

C_{j+1} = <(A_jθ_j ∪ ↓C_j) σ_j>.

The set A_jθ_j can also be represented in the same way, so that the equation can be
written

C_{j+1} = <(<A_j θ_j> ∪ ↓C_j) σ_j>.
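
The relation between an implicit representation <E θ_1 ... θ_n> and the explicit
expression it stands for can be sketched in a few lines. The encoding is an
assumption for illustration only: expressions are tuples, variables are strings
beginning with "?", substitutions are dictionaries, and an implicit expression is
simply a tuple whose first component is E and whose remaining components are the
delayed substitutions. The names apply and explicit are ours.

```python
def apply(e, theta):
    """Explicitly apply one substitution to an expression."""
    if isinstance(e, tuple):
        return tuple(apply(a, theta) for a in e)
    return theta.get(e, e)

def explicit(e, *thetas):
    """Carry out the delayed work: ((E theta_1) ... theta_n)."""
    for theta in thetas:
        e = apply(e, theta)
    return e

E = ("P", "?x", "?y")
theta = {"?x": ("f", "?z")}
lam = {"?z": "a", "?y": "b"}

implicit = (E, theta, lam)      # stands for <E theta lambda>; no work done yet
```

Building the tuple implicit costs nothing beyond storing three references. The
explicit expression [P [f a] b] is produced only when explicit(*implicit) is
called, for instance to print the output of a computation.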

Computations on the machine will of course use this implicit representation
wherever possible. In particular the successive unification substitutions will be
separate components of each state. The successive states of the computation
corresponding to a LUSH resolution proof of the minimal Horn sequent

P => (∃Y)C

are then the following (in Boyer-Moore form):

C_0     = <C ε>
C_1     = <(<A_1 θ_1> ∪ ↓C_0) σ_1>
C_2     = <(<A_2 θ_2> ∪ ↓C_1) σ_1 σ_2>
...
C_{j+1} = <(<A_{j+1} θ_{j+1}> ∪ ↓C_j) σ_1 σ_2 ... σ_{j+1}>
...
C_t     = <{} σ_1 σ_2 ... σ_t>

with the successive kernels <X_1 A_1 {H_1}>, ..., <X_t A_t {H_t}> of (not necessarily
different) procedure clauses in P supplying the sets A_j of new goals at each step,
and with each substitution σ_j satisfying the equation:

σ_j = (UNIFY H_jθ_j (↑C_{j-1}) <σ_1 ... σ_{j-1}>)   (j ≥ 1).

Throughout this LUSH computation no expression need actually be constructed
explicitly. At termination, the substitution (σ_1 σ_2 ... σ_t) is available to construct the
output of the computation.

The  computation  tree.  The  procedure  clause  chosen  at  each  step  of  the 
computation  is  one  of  the  only  finitely  many  occurring  in  the  antecedent  P  of  the 
sequent.  This  gives  rise  to  a  computation  space  which  is  a  finitary  (but  not 
necessarily  finite)  tree.  The  various  branches  of  the  tree  correspond  to  the  various 
LUSH resolution series starting with the given initial sequent. The root of the tree is
the body of the goal clause (= the active clause) of the initial sequent, and in
general each node of the tree is either

•  empty, and a leaf of the tree (a "success"), or

•  nonempty but with no successors, and a leaf of the tree (a "failure"), or

•  nonempty and with one or more successors.

The "success" branches (if any) of the computation tree are the completed
computations corresponding to the various LUSH resolution proofs of the initial
minimal sequent P => (∃Y)C, the body C of whose active clause (∃Y)C is the root
of the tree.

Each completed computation yields an output. It is then natural to view the
equation

Y = (Yσ_1σ_2 ... σ_t)

as the output of the completed computation, where the state <{} σ_1 σ_2 ... σ_t> is
its terminal node. The term(s) Yσ_1σ_2 ... σ_t are expressions constructed
stepwise by the successive unifiers σ_1, σ_2, ..., σ_t.

Nondeterminacy. For every true initial sequent S there is at least one such
computation  since  the  sequent  has  at  least  one  LUSH  resolution  proof.  S  may, 
however,  have  many,  even  infinitely  many,  such  proofs;  and  for  the  computation 
corresponding  to  each  proof  there  will  be  a  possibly  different  output.  It  is  the 
purpose  of  the  various  logic  programming  engines,  such  as  a  PROLOG  machine,  to 
obtain  all  such  outputs  when  given  a  true  minimal  Horn  sequent.  This  it  does  by 
making  a  complete  exploration  of  the  search  space.  If  the  search  space  is  infinite 
then  it  may  contain  infinitely  many  success  branches  (and  it  may  also  contain 
infinitely many failure branches). A properly designed logic programming engine
should  presumably  generate  the  search  tree  fairly,  i.e.,  in  such  a  way  (and  there 
are  many  options)  as  to  reach  any  given  node  in  the  tree  after  only  finitely  much 
time.  Unfortunately,  most  PROLOG  engines  are  designed  unfairly  (for  the  sake  of 
speed) as "depth-first backtracking" devices.

Example 1. The minimal Horn sequent

•  ∀x: [NUMBER x] → [NUMBER [1+ x]]

•  [NUMBER 0]

=> ∃y: [NUMBER y]

is true, and has infinitely many LUSH resolution proofs.

The outputs of the corresponding computations are the equations:

y = 0

y = [1+ 0]

y = [1+ [1+ 0]]

and so on.

At the jth step (j ≥ 0) the active clause is (understanding y_0 as just y itself):

∃y_j: [NUMBER [1+ ... [1+ y_j] ...]]   (with j 1+'s)

and the next resolution can be obtained by choosing either the second procedure
clause

(*)  [NUMBER 0]

or the [j+1]st variant

(**)  ∀y_{j+1}: [NUMBER y_{j+1}] → [NUMBER [1+ y_{j+1}]]

of the first procedure clause. The choice of (*) will yield an obvious sequent,
and the output will be

y = [1+ ... [1+ 0] ...]   (with j 1+'s)

The choice of (**) will produce the new active clause

∃y_{j+1}: [NUMBER [1+ ... [1+ y_{j+1}] ...]]   (with j+1 1+'s).

A depth-first backtracking device which chose (**) before (*) at each level would
therefore delay permanently the choice of (*) at the first level, and never reach even
the first success node in the search tree. Merely reversing the order of the choice
would produce a complete (although of course nonterminating) traversal of the
search tree.
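
The difference between a fair and an unfair traversal of this particular search tree
can be seen in a small simulation. It is not a general logic programming engine:
the tree of Example 1 is modelled directly, with a goal node at level j encoded as
("goal", j) and the success node reached from it by (*) encoded as ("success", j).
These encodings and all function names are assumptions made for illustration.

```python
from collections import deque
from itertools import islice

def successors(j):
    """Children of the goal node at level j, with (**) listed before (*)."""
    return [("goal", j + 1), ("success", j)]

def fair_outputs():
    """Breadth-first traversal: every node is reached after finitely many steps."""
    queue = deque([("goal", 0)])
    while queue:
        kind, j = queue.popleft()
        if kind == "success":
            yield j
        else:
            queue.extend(successors(j))

def depth_first_outputs(limit):
    """Depth-first traversal trying (**) first, cut off after `limit` node visits."""
    stack, found = [("goal", 0)], []
    for _ in range(limit):
        kind, j = stack.pop()
        if kind == "success":
            found.append(j)
        else:
            stack.extend(reversed(successors(j)))   # first child ends up on top
    return found

def numeral(j):
    """The output term with j occurrences of 1+."""
    return "0" if j == 0 else "[1+ " + numeral(j - 1) + "]"

first_three = [numeral(j) for j in islice(fair_outputs(), 3)]
```

The fair traversal produces the outputs in order of size, while the depth-first
traversal with (**) tried first visits only goal nodes, however large the limit.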

Example  2.  The  two  procedure  clauses 
PI:  Vx:  [CAT [] x x] 

P2:  Vx,a,b,c:  [[CAT a  be]  ->  [CAT [•  x a]  b [•  x c]] ] 

intuitively  define  [CAT  x  y  z]  to  mean  that  the  list  z  is  the  result  of  conCATenating 
the  lists  x  and  y  in  that  order.  The  goal  clause 

Q:  3p,q:  [CAT  p  q  [•  1  [•  2  [•  3  NIL]]]  ] 

intuitively  says  that  the  list  [1  2  3]  is  p  ++  q,  i.e.,  the  result  of  concatenating  two 
lists, p and q, in that order. The minimal Horn sequent {P1 P2} => Q is true, and
has four different LUSH resolution proofs, the simplest of which is obtained by
adding just one resolution invoking the procedure clause P1. The corresponding
computation has the output

[p q] = [NIL [1 2 3]]

describing the construction [1 2 3] = [ ]++[1 2 3]. The other three proofs consist
respectively of one, two and three successive invocations of the procedure clause
P2 followed by an invocation of the procedure clause P1. The outputs of the
corresponding computations describe respectively the constructions

[1 2 3] = [1]++[2 3],  [1 2 3] = [1 2]++[3],  [1 2 3] = [1 2 3]++[ ].
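The four computations of Example 2 can be mimicked by a small Python generator. It is a sketch only: Python lists stand in for the dotted-pair lists, the name `cat` is an assumption made for illustration, and the two branches of the function play the roles of the two procedure clauses without performing real unification.

```python
def cat(z):
    """Enumerate every pair (p, q) of lists with p ++ q == z."""
    # First clause, [CAT [ ] x x]: p is empty and q is the whole of z.
    yield [], z
    # Second clause, [CAT a b c] -> [CAT [• x a] b [• x c]]:
    # strip the head x from z, solve [CAT a b c] for the tail c,
    # then put x back in front of a.
    if z:
        x, c = z[0], z[1:]
        for a, b in cat(c):
            yield [x] + a, b


for p, q in cat([1, 2, 3]):
    print(p, "++", q)
```

The first pair produced uses the first clause alone; each later pair uses the second clause one more time before finishing with the first, matching the four proofs described above.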

Another  view  of  LUSH  resolution  proofs.  Clark  [3]  has  pointed  out  an 
interesting  alternative  way  to  view  a  computation  with  output  Y  =  Y0  corresponding 
to  a  LUSH  resolution  proof  of  a  true  minimal  Horn  sequent 
P => (∃Y)(G1 ∧ ... ∧ Gn).

Namely,  it  can  be  interpreted  as  the  construction  of  n  separate  hyper-resolution 
proofs  by  which  n  different  unconditional  procedure  clauses 
(∀X1)H1, ..., (∀Xn)Hn

are  simultaneously  deduced  from  the  procedure  clauses  in  P,  and  which  are  such 
that  the  expressions 

[H1 ... Hn]  and  [G1 ... Gn]

are unifiable with most general unifier θ.

Hyper-resolution.  Hyper-resolution  is  an  inference  pattern  for  u.d.  clauses 
which  requires  a  conditional  procedure  clause 

(1)  (∀X)(A1 ∧ ... ∧ Ak → B)

with k ≥ 1 goals A1, ..., Ak, as major premise, and k unconditional procedure
clauses

(2)  (∀X1)B1, ..., (∀Xk)Bk,

separated  from  it  and  from  each  other,  as  minor  premises,  such  that  the 
expressions 

[A1 ... Ak]  and  [B1 ... Bk]

are unifiable with most general unifier σ. The conclusion of the hyper-resolution
inference is then the unconditional procedure clause

(3)  (∀Z)C

where C = Bσ, and Z = (X ∪ X1 ∪ ... ∪ Xk)σ ∩ VARS. It is straightforward to
verify that the clause (3) is indeed a logical consequence of the clauses (2)
together with the clause (1).

One  may  use  this  inference  pattern  to  obtain,  from  a  set  P  of  procedure  clauses,  a 
series P0, P1, ..., of sets of procedure clauses all of which are logical
consequences of P, as follows. The set P0 is the set of unconditional procedure
clauses in P. The set Pj+1 is the result of adding to the set Pj all the unconditional
procedure clauses which can be inferred by a single application of
hyper-resolution with major premise in the set P and minor premises in the set
Pj. Clearly the union

P^k = P0 ∪ P1 ∪ ... ∪ Pk

of  the  first  k  of  these  sets  contains  all  the  unconditional  procedure  clauses 
deducible  from  P  by  no  more  than  k  steps  of  hyper-resolution.  It  is  natural  to 
organize  a  deduction  from  P  of  such  an  unconditional  procedure  clause  as  a  tree 
with  that  clause  as  its  root  and  with  members  of  P0  as  leaves.  Each  nonleaf  node 
in the tree is the immediate consequence, by hyper-resolution, of its immediate
descendants in the tree (as minor premises) and some conditional clause in P as
major  premise. 
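The construction of the sets P0, P1, ... can be sketched in Python as follows. This is a simplified illustration, not the general inference pattern: it assumes that every unconditional clause is ground and that every variable in a head also occurs in the body, so that unification with the minor premises reduces to one-way matching. The term encoding (tuples for compound expressions, strings beginning with "?" for variables) and all function names are assumptions made here.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")


def match(pattern, fact, subst):
    """Extend subst so that pattern equals the ground fact, or return None."""
    if is_var(pattern):
        if pattern in subst:
            return subst if subst[pattern] == fact else None
        return {**subst, pattern: fact}
    if isinstance(pattern, tuple):
        if not isinstance(fact, tuple) or len(pattern) != len(fact):
            return None
        for p, f in zip(pattern, fact):
            subst = match(p, f, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == fact else None


def instantiate(t, subst):
    """Apply subst to t (assumes every variable of t is bound in subst)."""
    if is_var(t):
        return subst[t]
    if isinstance(t, tuple):
        return tuple(instantiate(a, subst) for a in t)
    return t


def hyper_step(rules, facts):
    """Heads obtainable by one hyper-resolution step: major premise taken
    from rules, all minor premises taken from facts."""
    derived = set()
    for body, head in rules:
        substs = [{}]
        for goal in body:  # match the goals A1 ... Ak one after another
            substs = [s2 for s in substs for f in facts
                      if (s2 := match(goal, f, s)) is not None]
        derived.update(instantiate(head, s) for s in substs)
    return derived


def levels(facts, rules, k):
    """Return the list [P0, P1, ..., Pk] of sets of ground facts."""
    result = [frozenset(facts)]
    for _ in range(k):
        prev = result[-1]
        result.append(prev | hyper_step(rules, prev))
    return result


# The NUMBER program: one unconditional and one conditional clause.
facts = {("NUMBER", "0")}
rules = [([("NUMBER", "?y")], ("NUMBER", ("1+", "?y")))]
for j, level in enumerate(levels(facts, rules, 2)):
    print(j, sorted(level, key=str))
```

For the NUMBER program each level adds exactly one new fact, the numeral with one more 1+ than the largest numeral of the previous level.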

LUSH  resolution  and  hyper-resolution.  Now  consider  again  a  LUSH 
resolution proof whose initial sequent is P => (∃Y)(G1 ∧ ... ∧ Gn) and whose
output  is  Y  =  Y0.  Let 
C0, ..., Ct

be the successive states of the corresponding computation and let
D0, ..., Dt-1

be the successive procedure clauses Di such that Di is used with Ci to get Ci+1.
Note that

C0 = {G1, ..., Gn}.

Clark  observes  that  the  ith  step  of  the  computation  can  be  interpreted  as  attaching 
the goals in the body of Di as immediate successors to the goal ↑Ci. These goals
then become, together with those in the set ↓Ci, the goals in the set Ci+1. In this
manner  n  trees  are  grown,  each  nonleaf  node  in  which  follows,  by  a  single 
application  of  hyper-resolution,  from  its  immediate  successors  (as  minor  premises) 
and  some  procedure  clause  in  P  (as  major  premise).  Clark’s  interpretation  is  most 
illuminating,  for  it  shows  where  the  extraordinary  freedom  of  the  LUSH  resolution 
scheme  (to  use  any  selection  function  whatsoever)  comes  from.  It  comes  from  the 
fact  that  it  does  not  matter  in  what  order  the  nodes  of  the  n  hyper-resolution  trees 
are  treated  as  these  trees  are  grown. 
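The independence from the selection function can be illustrated on a deliberately trivial case. The sketch below assumes a propositional program with exactly one clause per predication, so that no unification and no search are involved; the program, the goal names and the two selection functions are all hypothetical. Each step removes the selected goal from the current state and attaches the body of its clause, which is exactly the growing of the trees described above.

```python
# Hypothetical propositional program: head -> body (one clause per head).
program = {"a": ["b", "c"], "b": [], "c": ["d"], "d": [], "e": []}


def run(goals, select, program):
    """Run the computation from the initial state `goals`.

    select(state) returns the index of the goal to treat next; the function
    returns the goals in the order in which they were selected.
    """
    state = list(goals)
    trace = []
    while state:
        goal = state.pop(select(state))  # the selected goal
        trace.append(goal)
        state.extend(program[goal])      # its body joins the remaining goals
    return trace


def first(state):
    return 0


def last(state):
    return len(state) - 1


print(run(["a", "e"], first, program))
print(run(["a", "e"], last, program))
```

Both runs end in the empty state after treating the same five nodes of the two trees rooted at a and e; only the order of treatment differs.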

Negation  as  failure.  The  preceding  discussion  has  dealt  only  with  the  pure 
Horn  clause  case  of  logic  programming.  In  practice,  one  can  work  (as  first 
explained  in  Clark  [2])  with  pseudo-Horn  clauses  by  generalising  the  definition  of 
Horn  clauses  in  the  following  way.  Instead  of  restricting  the  bodies  to  be  sets  of 
predications (= unnegated atomic sentences) we can allow them to be literals, that
is,  either  predications  or  negated  predications.  However,  both  the  definition  of 
procedure  clauses  as  having  a  head  containing  exactly  one  predication,  and  the 
definition  of  goal  clauses  as  having  an  empty  head,  are  retained.  The  basic 
computation cycle is then extended to deal with the case that the goal ↑Ci can
now be a negated predication, say, ¬G (hence not unifiable with the head of any
procedure  clause).  If  G  is  a  ground  expression  an  attempt  is  made  to  prove  the 
sequent  P  =>  G.  This  attempt  can  have  three  outcomes: 

1  it can terminate with a proof, in which case ¬G is "disproved"
and Ci has no successors

2  it can terminate but without finding a proof ("finite failure"),
in which case ("negation by failure") ¬G is "proved" and
Ci has the successor ↓Ci

3  it can fail to terminate, in which case the attempt must be
eventually aborted without any decision as to the provability
of G from P.

Evidently,  1  assumes  the  simple  consistency  of  P,  while  2  assumes  something 
like  the  completeness  of  P  ("if  G  were  true  it  would  be  provable  from  P")  for  ground 
literals.  We  cannot  discuss  further  here  this  important  and  interesting  topic,  which 
is  the  subject  of  much  current  research.  We  refer  the  reader  to  Clark  [2]  and  Lloyd 
[17]  for  further  details. 
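The three outcomes can be sketched for the ground case in Python. Everything here is an illustrative assumption rather than part of the text: the program is propositional, a negated goal is written as the pair ("not", G), and a step budget `max_steps` stands in for the decision to abort an attempt that has run too long (a real nonterminating attempt cannot, of course, be detected in general).

```python
from enum import Enum


class Outcome(Enum):
    PROVED = 1          # the attempt terminated with a proof of G
    FINITE_FAILURE = 2  # the attempt terminated without finding a proof
    ABORTED = 3         # the step budget ran out: no decision about G


class Aborted(Exception):
    pass


def prove(goals, program, budget):
    """Depth-first attempt to establish every goal in the list `goals`.

    budget is a one-element list holding the number of steps still allowed.
    """
    if not goals:
        return True
    if budget[0] <= 0:
        raise Aborted
    budget[0] -= 1
    first, rest = goals[0], goals[1:]
    if isinstance(first, tuple):  # a negated goal ("not", G)
        # Negation as failure: succeed exactly when the attempt on G
        # terminates without a proof.
        return (not prove([first[1]], program, budget)
                and prove(rest, program, budget))
    return any(prove(body + rest, program, budget)
               for body in program.get(first, []))


def attempt(goal, program, max_steps=100):
    try:
        proved = prove([goal], program, [max_steps])
    except Aborted:
        return Outcome.ABORTED
    return Outcome.PROVED if proved else Outcome.FINITE_FAILURE


# Hypothetical ground program: head -> list of alternative bodies.
program = {
    "bird": [[]],
    "flies": [["bird", ("not", "penguin")]],
    "loop": [["loop"]],
}
for g in ["flies", "penguin", "loop"]:
    print(g, attempt(g, program))
```

Here "penguin" fails finitely because no clause has it as head, so the negated goal in the body of "flies" succeeds, while the attempt on "loop" exhausts its budget and yields no decision.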